Overview

Dataset statistics

Number of variables: 17
Number of observations: 17000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.2 MiB
Average record size in memory: 136.0 B
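
The two memory figures can be cross-checked against each other. The sketch below assumes every cell costs 8 bytes (a 64-bit float or an object pointer); the report does not state the column dtypes, so that per-cell cost is an assumption.

```python
# Figures copied from the dataset statistics above.
n_variables = 17
n_observations = 17_000
avg_record_bytes = 136.0

# Assumption: every cell costs the same number of bytes.
bytes_per_cell = avg_record_bytes / n_variables

# Total footprint implied by the average record size, in MiB (1 MiB = 1024**2 bytes).
total_bytes = avg_record_bytes * n_observations
total_mib = total_bytes / 1024**2

print(f"{bytes_per_cell:.1f} B per cell, {total_mib:.1f} MiB in total")
```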

Variable types

Categorical: 8
Numeric: 9

Alerts

N-P-K Ratio has a high cardinality: 11172 distinct values [High cardinality]
Yield has a high cardinality: 5004 distinct values [High cardinality]
Temperature is highly overall correlated with Rainfall and 8 other fields [High correlation]
Rainfall is highly overall correlated with Temperature and 7 other fields [High correlation]
pH is highly overall correlated with Temperature and 8 other fields [High correlation]
Light_Hours is highly overall correlated with Temperature and 10 other fields [High correlation]
Light_Intensity is highly overall correlated with Temperature and 6 other fields [High correlation]
Rh is highly overall correlated with Rainfall and 6 other fields [High correlation]
Nitrogen is highly overall correlated with Name [High correlation]
Phosphorus is highly overall correlated with pH and 2 other fields [High correlation]
Potassium is highly overall correlated with Temperature and 8 other fields [High correlation]
Soil_Type is highly overall correlated with Rainfall and 6 other fields [High correlation]
Fertility is highly overall correlated with Temperature and 7 other fields [High correlation]
Photoperiod is highly overall correlated with Temperature and 6 other fields [High correlation]
Category_pH is highly overall correlated with Temperature and 5 other fields [High correlation]
Season is highly overall correlated with Name [High correlation]
Name is highly overall correlated with Temperature and 12 other fields [High correlation]
Yield is highly imbalanced (53.6%) [Imbalance]
Name is uniformly distributed [Uniform]
Temperature has unique values [Unique]
Rainfall has unique values [Unique]
pH has unique values [Unique]
Light_Hours has unique values [Unique]
Light_Intensity has unique values [Unique]
Rh has unique values [Unique]
Phosphorus has unique values [Unique]
Potassium has unique values [Unique]
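
The 53.6% imbalance figure for Yield can be approximated from the value counts listed in the Yield section. The sketch below uses a normalized-entropy score (one minus the Shannon entropy divided by its maximum for that many classes). Whether the profiler computes the alert exactly this way is an assumption, and `imbalance_score` is a name chosen here for illustration.

```python
import math

# Yield value counts as listed in this report: four text labels
# plus 5000 values that each occur exactly once (5004 distinct in total).
counts = [5810, 3557, 1500, 1133] + [1] * 5000


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, larger when skewed."""
    if len(counts) < 2:
        return 0.0  # a single class has no meaningful imbalance
    total = sum(counts)
    entropy = -sum(c / total * math.log2(c / total) for c in counts)
    return 1 - entropy / math.log2(len(counts))


print(f"{imbalance_score(counts):.1%}")
```

A score near 0 means the values are spread evenly; here the four labels cover 12000 of the 17000 rows, which pulls the score up.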

Reproduction

Analysis started: 2024-05-31 20:19:14.127703
Analysis finished: 2024-05-31 20:19:38.857708
Duration: 24.73 seconds
Software version: ydata-profiling v4.0.0
Download configuration: config.json
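
As a quick consistency check, the reported duration follows from the two timestamps. This is a minimal sketch using only the standard library.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block above.
started = datetime.fromisoformat("2024-05-31 20:19:14.127703")
finished = datetime.fromisoformat("2024-05-31 20:19:38.857708")

# Subtracting two datetimes yields a timedelta.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```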

Variables

Soil_Type
Categorical

HIGH CORRELATION 

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 132.9 KiB

High: 6500
Moderate: 4500
Loamy: 2500
Sandy Loam: 1500
moderate: 1000
Other values (2): 1000

Length

Max length: 10
Median length: 8
Mean length: 6.1764706
Min length: 4

Characters and Unicode

Total characters: 105000
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
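
The length and character totals for Soil_Type can be recomputed from the value counts in its Common Values table. This is a sketch of the arithmetic only, not the profiler's own code.

```python
# Soil_Type labels and counts copied from the Common Values table.
value_counts = {
    "High": 6500,
    "Moderate": 4500,
    "Loamy": 2500,
    "Sandy Loam": 1500,
    "moderate": 1000,
    "Sandy loam": 500,
    "Sandy": 500,
}

n_rows = sum(value_counts.values())
# Each label contributes its length once per row in which it occurs.
total_chars = sum(len(label) * n for label, n in value_counts.items())
mean_length = total_chars / n_rows
distinct_chars = len(set("".join(value_counts)))

print("Rows:", n_rows)
print("Total characters:", total_chars)
print("Mean length:", round(mean_length, 7))
print("Distinct characters:", distinct_chars)
```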

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Sandy Loam
2nd row: Sandy Loam
3rd row: Sandy Loam
4th row: Sandy Loam
5th row: Sandy Loam

Common Values

Value         Count   Frequency (%)
High           6500   38.2%
Moderate       4500   26.5%
Loamy          2500   14.7%
Sandy Loam     1500   8.8%
moderate       1000   5.9%
Sandy loam      500   2.9%
Sandy           500   2.9%
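
The table counts "Moderate" and "moderate", and "Sandy Loam" and "Sandy loam", as separate categories. If these are spelling variants of the same label (an assumption, since the report does not say so), a case-insensitive merge along these lines would collapse them.

```python
from collections import Counter

# Soil_Type labels and counts copied from the Common Values table.
value_counts = {
    "High": 6500,
    "Moderate": 4500,
    "Loamy": 2500,
    "Sandy Loam": 1500,
    "moderate": 1000,
    "Sandy loam": 500,
    "Sandy": 500,
}

merged = Counter()
for label, n in value_counts.items():
    # Collapse repeated whitespace and ignore case when comparing labels.
    key = " ".join(label.split()).casefold()
    merged[key] += n

for label, n in merged.most_common():
    print(f"{label:<12}{n:>6}")
```

The merge reduces seven distinct labels to five. It leaves "sandy" and "sandy loam" apart, because deciding whether those belong together needs domain knowledge rather than string cleanup.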

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
ValueCountFrequency (%)
high 6500
34.2%
moderate 5500
28.9%
loamy 2500
 
13.2%
sandy 2500
 
13.2%
loam 2000
 
10.5%

Most occurring characters

ValueCountFrequency (%)
a 12500
11.9%
e 11000
10.5%
o 10000
 
9.5%
d 8000
 
7.6%
i 6500
 
6.2%
H 6500
 
6.2%
h 6500
 
6.2%
g 6500
 
6.2%
r 5500
 
5.2%
t 5500
 
5.2%
Other values (8) 26500
25.2%

Most occurring categories

Value               Count   Frequency (%)
Lowercase Letter    85500   81.4%
Uppercase Letter    17500   16.7%
Space Separator      2000   1.9%
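
The category breakdown above can be reproduced with Python's standard `unicodedata` module, which returns the Unicode general category of each character. The counts are taken from the Soil_Type Common Values table; the mapping from two-letter codes to display names is written out by hand here.

```python
import unicodedata
from collections import Counter

# Soil_Type labels and counts copied from the Common Values table.
value_counts = {
    "High": 6500,
    "Moderate": 4500,
    "Loamy": 2500,
    "Sandy Loam": 1500,
    "moderate": 1000,
    "Sandy loam": 500,
    "Sandy": 500,
}

category_counts = Counter()
for label, n in value_counts.items():
    for ch in label:
        # unicodedata.category returns a two-letter code such as "Ll" or "Zs".
        category_counts[unicodedata.category(ch)] += n

names = {"Ll": "Lowercase Letter", "Lu": "Uppercase Letter", "Zs": "Space Separator"}
for code, n in category_counts.most_common():
    print(f"{names.get(code, code):<18}{n:>7}")
```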

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 12500
14.6%
e 11000
12.9%
o 10000
11.7%
d 8000
9.4%
i 6500
7.6%
h 6500
7.6%
g 6500
7.6%
r 5500
6.4%
t 5500
6.4%
m 5500
6.4%
Other values (3) 8000
9.4%
Uppercase Letter
ValueCountFrequency (%)
H 6500
37.1%
M 4500
25.7%
L 4000
22.9%
S 2500
 
14.3%
Space Separator
ValueCountFrequency (%)
2000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 103000
98.1%
Common 2000
 
1.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 12500
12.1%
e 11000
10.7%
o 10000
9.7%
d 8000
 
7.8%
i 6500
 
6.3%
H 6500
 
6.3%
h 6500
 
6.3%
g 6500
 
6.3%
r 5500
 
5.3%
t 5500
 
5.3%
Other values (7) 24500
23.8%
Common
ValueCountFrequency (%)
2000
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 105000
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 12500
11.9%
e 11000
10.5%
o 10000
 
9.5%
d 8000
 
7.6%
i 6500
 
6.2%
H 6500
 
6.2%
h 6500
 
6.2%
g 6500
 
6.2%
r 5500
 
5.2%
t 5500
 
5.2%
Other values (8) 26500
25.2%

Fertility
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 132.9 KiB

Short Day Period, Day Neutral: 8500
High: 3500
moderate: 1500
Short Day Period: 1500
Short Day Period, Day Neutral, Long Day Period: 1500

Length

Max length: 46
Median length: 37.5
Mean length: 22.323529
Min length: 4

Characters and Unicode

Total characters: 379500
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: High
2nd row: High
3rd row: High
4th row: High
5th row: High

Common Values

Value                                             Count   Frequency (%)
Short Day Period, Day Neutral                      8500   50.0%
High                                               3500   20.6%
moderate                                           1500   8.8%
Short Day Period                                   1500   8.8%
Short Day Period, Day Neutral, Long Day Period     1500   8.8%
Day Neutral, Long Day Period                        500   2.9%

Length

Histogram of lengths of the category

Common Values (Plot)

[Figure: common values plot]
Value       Count   Frequency (%)
day         24000   36.1%
period      13500   20.3%
short       11500   17.3%
neutral     10500   15.8%
high         3500   5.3%
long         2000   3.0%
moderate     1500   2.3%
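
The word counts above can be rebuilt from the Fertility value counts. The tokenizer in this sketch (lowercase, then keep runs of letters) is an assumption chosen because it matches the listed words; the profiler's actual tokenizer may differ.

```python
import re
from collections import Counter

# Fertility labels and counts copied from the Common Values table.
value_counts = {
    "Short Day Period, Day Neutral": 8500,
    "High": 3500,
    "moderate": 1500,
    "Short Day Period": 1500,
    "Short Day Period, Day Neutral, Long Day Period": 1500,
    "Day Neutral, Long Day Period": 500,
}

word_counts = Counter()
for label, n in value_counts.items():
    # Assumed tokenizer: lowercase the label, then keep runs of letters.
    for word in re.findall(r"[a-z]+", label.lower()):
        word_counts[word] += n

total = sum(word_counts.values())
for word, n in word_counts.most_common():
    print(f"{word:<10}{n:>6}{n / total:>8.1%}")
```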

Most occurring characters

ValueCountFrequency (%)
49500
13.0%
r 37000
 
9.7%
a 36000
 
9.5%
o 28500
 
7.5%
e 27000
 
7.1%
D 24000
 
6.3%
y 24000
 
6.3%
t 23500
 
6.2%
i 17000
 
4.5%
h 15000
 
4.0%
Other values (12) 98000
25.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 253000
66.7%
Uppercase Letter 65000
 
17.1%
Space Separator 49500
 
13.0%
Other Punctuation 12000
 
3.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 37000
14.6%
a 36000
14.2%
o 28500
11.3%
e 27000
10.7%
y 24000
9.5%
t 23500
9.3%
i 17000
6.7%
h 15000
5.9%
d 15000
5.9%
u 10500
 
4.2%
Other values (4) 19500
7.7%
Uppercase Letter
ValueCountFrequency (%)
D 24000
36.9%
P 13500
20.8%
S 11500
17.7%
N 10500
16.2%
H 3500
 
5.4%
L 2000
 
3.1%
Space Separator
ValueCountFrequency (%)
49500
100.0%
Other Punctuation
ValueCountFrequency (%)
, 12000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 318000
83.8%
Common 61500
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 37000
11.6%
a 36000
11.3%
o 28500
9.0%
e 27000
 
8.5%
D 24000
 
7.5%
y 24000
 
7.5%
t 23500
 
7.4%
i 17000
 
5.3%
h 15000
 
4.7%
d 15000
 
4.7%
Other values (10) 71000
22.3%
Common
ValueCountFrequency (%)
49500
80.5%
, 12000
 
19.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 379500
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
49500
13.0%
r 37000
 
9.7%
a 36000
 
9.5%
o 28500
 
7.5%
e 27000
 
7.1%
D 24000
 
6.3%
y 24000
 
6.3%
t 23500
 
6.2%
i 17000
 
4.5%
h 15000
 
4.0%
Other values (12) 98000
25.8%

Photoperiod
Categorical

HIGH CORRELATION 

Distinct: 13
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 132.9 KiB

Short Day Period: 5000
10-10-2010: 5000
10:10:10: 1500
05-10-2005: 1000
75:37.5:37.5: 500
Other values (8): 4000

Length

Max length: 16
Median length: 12
Mean length: 11.264706
Min length: 7

Characters and Unicode

Total characters: 191500
Distinct characters: 24
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Short Day Period
2nd row: Short Day Period
3rd row: Short Day Period
4th row: Short Day Period
5th row: Short Day Period

Common Values

Value               Count   Frequency (%)
Short Day Period     5000   29.4%
10-10-2010           5000   29.4%
10:10:10             1500   8.8%
05-10-2005           1000   5.9%
75:37.5:37.5          500   2.9%
5:10:10               500   2.9%
5:10:05               500   2.9%
22:12:13              500   2.9%
8:15:36               500   2.9%
13:13:13              500   2.9%
Other values (3)     1500   8.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
short 5000
18.5%
day 5000
18.5%
period 5000
18.5%
10-10-2010 5000
18.5%
10:10:10 1500
 
5.6%
05-10-2005 1000
 
3.7%
75:37.5:37.5 500
 
1.9%
5:10:10 500
 
1.9%
5:10:05 500
 
1.9%
22:12:13 500
 
1.9%
Other values (5) 2500
9.3%

Most occurring characters

ValueCountFrequency (%)
0 36500
19.1%
1 26500
13.8%
- 15000
 
7.8%
o 10000
 
5.2%
r 10000
 
5.2%
10000
 
5.2%
2 10000
 
5.2%
: 9000
 
4.7%
5 6000
 
3.1%
S 5000
 
2.6%
Other values (14) 53500
27.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 86500
45.2%
Lowercase Letter 55000
28.7%
Dash Punctuation 15000
 
7.8%
Uppercase Letter 15000
 
7.8%
Space Separator 10000
 
5.2%
Other Punctuation 10000
 
5.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 10000
18.2%
r 10000
18.2%
h 5000
9.1%
d 5000
9.1%
i 5000
9.1%
e 5000
9.1%
y 5000
9.1%
a 5000
9.1%
t 5000
9.1%
Decimal Number
ValueCountFrequency (%)
0 36500
42.2%
1 26500
30.6%
2 10000
 
11.6%
5 6000
 
6.9%
3 3500
 
4.0%
6 2000
 
2.3%
7 1500
 
1.7%
8 500
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
S 5000
33.3%
P 5000
33.3%
D 5000
33.3%
Other Punctuation
ValueCountFrequency (%)
: 9000
90.0%
. 1000
 
10.0%
Dash Punctuation
ValueCountFrequency (%)
- 15000
100.0%
Space Separator
ValueCountFrequency (%)
10000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 121500
63.4%
Latin 70000
36.6%

Most frequent character per script

Common
ValueCountFrequency (%)
0 36500
30.0%
1 26500
21.8%
- 15000
12.3%
10000
 
8.2%
2 10000
 
8.2%
: 9000
 
7.4%
5 6000
 
4.9%
3 3500
 
2.9%
6 2000
 
1.6%
7 1500
 
1.2%
Other values (2) 1500
 
1.2%
Latin
ValueCountFrequency (%)
o 10000
14.3%
r 10000
14.3%
S 5000
7.1%
h 5000
7.1%
d 5000
7.1%
i 5000
7.1%
e 5000
7.1%
P 5000
7.1%
y 5000
7.1%
a 5000
7.1%
Other values (2) 10000
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 191500
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 36500
19.1%
1 26500
13.8%
- 15000
 
7.8%
o 10000
 
5.2%
r 10000
 
5.2%
10000
 
5.2%
2 10000
 
5.2%
: 9000
 
4.7%
5 6000
 
3.1%
S 5000
 
2.6%
Other values (14) 53500
27.9%

N-P-K Ratio
Categorical

HIGH CARDINALITY 

Distinct: 11172
Distinct (%): 65.7%
Missing: 0
Missing (%): 0.0%
Memory size: 132.9 KiB

10:10:10: 4000
5:10:10: 1000
5:14:04: 4
9:22:03: 3
7:37:47: 3
Other values (11167): 11990

Length

Max length: 8
Median length: 8
Mean length: 7.6510588
Min length: 7

Characters and Unicode

Total characters: 130068
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 10379
Unique (%): 61.1%

Sample

1st row: 10:10:10
2nd row: 10:10:10
3rd row: 10:10:10
4th row: 10:10:10
5th row: 10:10:10

Common Values

Value                    Count   Frequency (%)
10:10:10                  4000   23.5%
5:10:10                   1000   5.9%
5:14:04                      4   < 0.1%
9:22:03                      3   < 0.1%
7:37:47                      3   < 0.1%
18:36:46                     3   < 0.1%
22:10:18                     3   < 0.1%
10:59:28                     3   < 0.1%
0:29:23                      3   < 0.1%
13:08:20                     3   < 0.1%
Other values (11162)     11975   70.4%
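
With 11172 distinct strings, this column is hard to use as a category. If each value is read as three colon-separated numbers (an assumption: zero-padded entries such as 5:14:04 look like ratios that passed through a time-of-day format), it can be split into numeric parts. `parse_npk` is a hypothetical helper written for this sketch, and the last two sample strings come from the Photoperiod listing.

```python
def parse_npk(text):
    """Split an 'N:P:K' string into three floats; return None if malformed."""
    parts = text.strip().split(":")
    if len(parts) != 3:
        return None
    try:
        return tuple(float(p) for p in parts)
    except ValueError:
        return None


samples = ["10:10:10", "5:14:04", "75:37.5:37.5", "Short Day Period"]
for s in samples:
    print(f"{s!r:>20} -> {parse_npk(s)}")
```

Returning `None` for anything that does not parse keeps stray labels visible instead of silently coercing them.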

Length

Histogram of lengths of the category
ValueCountFrequency (%)
10:10:10 4000
 
23.5%
5:10:10 1000
 
5.9%
5:14:04 4
 
< 0.1%
3:08:20 3
 
< 0.1%
9:32:40 3
 
< 0.1%
6:26:28 3
 
< 0.1%
5:18:52 3
 
< 0.1%
8:03:27 3
 
< 0.1%
12:49:04 3
 
< 0.1%
15:34:24 3
 
< 0.1%
Other values (11162) 11975
70.4%

Most occurring characters

ValueCountFrequency (%)
: 34000
26.1%
1 27026
20.8%
0 21858
16.8%
2 9932
 
7.6%
5 8381
 
6.4%
3 7872
 
6.1%
4 7396
 
5.7%
7 3438
 
2.6%
6 3436
 
2.6%
8 3386
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 96068
73.9%
Other Punctuation 34000
 
26.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 27026
28.1%
0 21858
22.8%
2 9932
 
10.3%
5 8381
 
8.7%
3 7872
 
8.2%
4 7396
 
7.7%
7 3438
 
3.6%
6 3436
 
3.6%
8 3386
 
3.5%
9 3343
 
3.5%
Other Punctuation
ValueCountFrequency (%)
: 34000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 130068
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
: 34000
26.1%
1 27026
20.8%
0 21858
16.8%
2 9932
 
7.6%
5 8381
 
6.4%
3 7872
 
6.1%
4 7396
 
5.7%
7 3438
 
2.6%
6 3436
 
2.6%
8 3386
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 130068
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
: 34000
26.1%
1 27026
20.8%
0 21858
16.8%
2 9932
 
7.6%
5 8381
 
6.4%
3 7872
 
6.1%
4 7396
 
5.7%
7 3438
 
2.6%
6 3436
 
2.6%
8 3386
 
2.6%

Temperature
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 17000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 789.19323
Minimum: 7.8165889
Maximum: 2553.3185
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 132.9 KiB

Quantile statistics

Minimum: 7.8165889
5-th percentile: 18.182464
Q1: 24.703671
median: 789.65874
Q3: 1148.7923
95-th percentile: 1852.416
Maximum: 2553.3185
Range: 2545.5019
Interquartile range (IQR): 1124.0886

Descriptive statistics

Standard deviation: 597.84056
Coefficient of variation (CV): 0.75753382
Kurtosis: -0.75702227
Mean: 789.19323
Median Absolute Deviation (MAD): 437.14067
Skewness: 0.17903404
Sum: 13416285
Variance: 357413.34
Monotonicity: Not monotonic
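
Several of the Temperature statistics are derived from others, which makes them easy to sanity-check. The sketch below recomputes range, IQR, coefficient of variation and variance from the rounded figures printed above, so small differences in the last digits are expected.

```python
# Temperature figures copied from the tables above (already rounded).
minimum, maximum = 7.8165889, 2553.3185
q1, q3 = 24.703671, 1148.7923
mean, std = 789.19323, 597.84056

value_range = maximum - minimum  # reported: 2545.5019
iqr = q3 - q1                    # reported: 1124.0886
cv = std / mean                  # reported: 0.75753382
variance = std ** 2              # reported: 357413.34

print(f"range={value_range:.4f} iqr={iqr:.4f} cv={cv:.5f} variance={variance:.1f}")
```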
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
21.04003128 1
 
< 0.1%
1013.236139 1
 
< 0.1%
1022.339262 1
 
< 0.1%
1064.188845 1
 
< 0.1%
973.9528951 1
 
< 0.1%
1004.968663 1
 
< 0.1%
990.1127187 1
 
< 0.1%
942.3268484 1
 
< 0.1%
1046.009864 1
 
< 0.1%
973.7467293 1
 
< 0.1%
Other values (16990) 16990
99.9%
ValueCountFrequency (%)
7.816588897 1
< 0.1%
8.690023184 1
< 0.1%
10.01686291 1
< 0.1%
10.14237299 1
< 0.1%
10.34066779 1
< 0.1%
11.10325114 1
< 0.1%
11.20294215 1
< 0.1%
11.23693309 1
< 0.1%
11.24321073 1
< 0.1%
11.42717053 1
< 0.1%
ValueCountFrequency (%)
2553.318516 1
< 0.1%
2481.070128 1
< 0.1%
2460.967189 1
< 0.1%
2430.275328 1
< 0.1%
2424.832371 1
< 0.1%
2407.957071 1
< 0.1%
2402.775059 1
< 0.1%
2385.554267 1
< 0.1%
2382.883441 1
< 0.1%
2381.874806 1
< 0.1%

Rainfall
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 17000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 269.80269
Minimum: 3.0383672
Maximum: 1587.8441
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 132.9 KiB

Quantile statistics

Minimum: 3.0383672
5-th percentile: 4.1354813
Q1: 6.3106823
median: 6.6871317
Q3: 635.41404
95-th percentile: 1100.714
Maximum: 1587.8441
Range: 1584.8057
Interquartile range (IQR): 629.10336

Descriptive statistics

Standard deviation: 429.85796
Coefficient of variation (CV): 1.5932308
Kurtosis: -0.052465377
Mean: 269.80269
Median Absolute Deviation (MAD): 0.73644495
Skewness: 1.2238016
Sum: 4586645.7
Variance: 184777.87
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
487.3250733 1
 
< 0.1%
6.055582669 1
 
< 0.1%
6.251191112 1
 
< 0.1%
5.791701727 1
 
< 0.1%
5.890309377 1
 
< 0.1%
5.769991417 1
 
< 0.1%
5.90619678 1
 
< 0.1%
5.913955691 1
 
< 0.1%
6.054257729 1
 
< 0.1%
5.708074968 1
 
< 0.1%
Other values (16990) 16990
99.9%
ValueCountFrequency (%)
3.038367156 1
< 0.1%
3.051387011 1
< 0.1%
3.065755465 1
< 0.1%
3.080045544 1
< 0.1%
3.081249358 1
< 0.1%
3.087128863 1
< 0.1%
3.102323383 1
< 0.1%
3.10582515 1
< 0.1%
3.110565826 1
< 0.1%
3.128124361 1
< 0.1%
ValueCountFrequency (%)
1587.844114 1
< 0.1%
1583.280092 1
< 0.1%
1578.528063 1
< 0.1%
1576.41356 1
< 0.1%
1575.045037 1
< 0.1%
1563.698381 1
< 0.1%
1554.895173 1
< 0.1%
1551.122326 1
< 0.1%
1547.381386 1
< 0.1%
1544.063304 1
< 0.1%

pH
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 17000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.273813
Minimum: 4.9086682
Maximum: 16.293481
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 132.9 KiB

Quantile statistics

Minimum: 4.9086682
5-th percentile: 5.9476157
Q1: 6.53736
median: 12.023648
Q3: 13.037446
95-th percentile: 13.695855
Maximum: 16.293481
Range: 11.384813
Interquartile range (IQR): 6.500086

Descriptive statistics

Standard deviation: 3.1688944
Coefficient of variation (CV): 0.30844385
Kurtosis: -1.686657
Mean: 10.273813
Median Absolute Deviation (MAD): 1.4318424
Skewness: -0.33196022
Sum: 174654.82
Variance: 10.041892
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6.522557831 1
 
< 0.1%
13.13812111 1
 
< 0.1%
12.49214993 1
 
< 0.1%
12.63541853 1
 
< 0.1%
12.80795811 1
 
< 0.1%
12.82772683 1
 
< 0.1%
13.07056563 1
 
< 0.1%
12.73399696 1
 
< 0.1%
12.63112677 1
 
< 0.1%
12.43151874 1
 
< 0.1%
Other values (16990) 16990
99.9%
ValueCountFrequency (%)
4.908668224 1
< 0.1%
5.060424398 1
< 0.1%
5.074020078 1
< 0.1%
5.077365751 1
< 0.1%
5.103282844 1
< 0.1%
5.113744425 1
< 0.1%
5.135914903 1
< 0.1%
5.141645827 1
< 0.1%
5.144757372 1
< 0.1%
5.156359215 1
< 0.1%
ValueCountFrequency (%)
16.29348099 1
< 0.1%
15.97914268 1
< 0.1%
15.90051245 1
< 0.1%
15.88411817 1
< 0.1%
15.78343873 1
< 0.1%
15.78212733 1
< 0.1%
15.72842653 1
< 0.1%
15.59242559 1
< 0.1%
15.5893546 1
< 0.1%
15.58809081 1
< 0.1%

Light_Hours
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 17000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 297.91633
Minimum: 5.7330288
Maximum: 1026.6339
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 132.9 KiB

Quantile statistics

Minimum: 5.7330288
5-th percentile: 6.8137606
Q1: 9.1311838
median: 248.93405
Q3: 508.98591
95-th percentile: 817.62212
Maximum: 1026.6339
Range: 1020.9009
Interquartile range (IQR): 499.85472

Descriptive statistics

Standard deviation: 275.21588
Coefficient of variation (CV): 0.9238026
Kurtosis: -0.65375358
Mean: 297.91633
Median Absolute Deviation (MAD): 239.96268
Skewness: 0.67809318
Sum: 5064577.7
Variance: 75743.782
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6.957984967 1
 
< 0.1%
266.6742487 1
 
< 0.1%
221.3760654 1
 
< 0.1%
256.3818406 1
 
< 0.1%
248.2142882 1
 
< 0.1%
230.5299224 1
 
< 0.1%
247.0168503 1
 
< 0.1%
231.4837146 1
 
< 0.1%
259.9598987 1
 
< 0.1%
239.6336499 1
 
< 0.1%
Other values (16990) 16990
99.9%
ValueCountFrequency (%)
5.733028778 1
< 0.1%
5.96375288 1
< 0.1%
5.982744778 1
< 0.1%
6.035096338 1
< 0.1%
6.039988489 1
< 0.1%
6.055548583 1
< 0.1%
6.118049313 1
< 0.1%
6.119006743 1
< 0.1%
6.119659435 1
< 0.1%
6.121442131 1
< 0.1%
ValueCountFrequency (%)
1026.633918 1
< 0.1%
1001.545013 1
< 0.1%
995.581332 1
< 0.1%
984.807533 1
< 0.1%
980.7155321 1
< 0.1%
977.54756 1
< 0.1%
975.7297802 1
< 0.1%
975.1516711 1
< 0.1%
971.6586556 1
< 0.1%
969.7277318 1
< 0.1%

Light_Intensity
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 17000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 188.804
Minimum: 31.016205
Maximum: 851.2711
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 132.9 KiB

Quantile statistics

Minimum: 31.016205
5-th percentile: 44.215018
Q1: 88.291391
median: 92.712352
Q3: 338.89933
95-th percentile: 562.14523
Maximum: 851.2711
Range: 820.25489
Interquartile range (IQR): 250.60794

Descriptive statistics

Standard deviation: 177.70304
Coefficient of variation (CV): 0.94120379
Kurtosis: 0.10194782
Mean: 188.804
Median Absolute Deviation (MAD): 6.2276678
Skewness: 1.2387191
Sum: 3209668
Variance: 31578.37
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
632.013531 1
 
< 0.1%
92.64395318 1
 
< 0.1%
92.82327006 1
 
< 0.1%
92.52280829 1
 
< 0.1%
93.70186234 1
 
< 0.1%
91.20472213 1
 
< 0.1%
93.36622819 1
 
< 0.1%
92.10441262 1
 
< 0.1%
91.61301145 1
 
< 0.1%
92.35982837 1
 
< 0.1%
Other values (16990) 16990
99.9%
ValueCountFrequency (%)
31.01620528 1
< 0.1%
31.0713212 1
< 0.1%
31.08633482 1
< 0.1%
31.32046767 1
< 0.1%
31.40298199 1
< 0.1%
31.46019724 1
< 0.1%
31.51555026 1
< 0.1%
31.72765476 1
< 0.1%
31.74696032 1
< 0.1%
31.81529862 1
< 0.1%
ValueCountFrequency (%)
851.2710976 1
< 0.1%
812.2731013 1
< 0.1%
805.1010658 1
< 0.1%
804.3248189 1
< 0.1%
798.2941434 1
< 0.1%
793.2642278 1
< 0.1%
789.6126354 1
< 0.1%
787.0514667 1
< 0.1%
783.3224218 1
< 0.1%
781.5103094 1
< 0.1%

Rh
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 17000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 115.25283
Minimum: 40.382485
Maximum: 310.9134
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 132.9 KiB

Quantile statistics

Minimum: 40.382485
5-th percentile: 51.009775
Q1: 67.653446
median: 111.70792
Q3: 154.78994
95-th percentile: 198.78191
Maximum: 310.9134
Range: 270.53092
Interquartile range (IQR): 87.136491

Descriptive statistics

Standard deviation: 55.467886
Coefficient of variation (CV): 0.48127136
Kurtosis: 1.4266691
Mean: 115.25283
Median Absolute Deviation (MAD): 43.47636
Skewness: 1.0313538
Sum: 1959298.1
Variance: 3076.6864
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
54.62175171 1
 
< 0.1%
178.8721185 1
 
< 0.1%
186.7098905 1
 
< 0.1%
168.8841948 1
 
< 0.1%
166.3697452 1
 
< 0.1%
184.1680032 1
 
< 0.1%
176.9196417 1
 
< 0.1%
176.1770366 1
 
< 0.1%
176.8426398 1
 
< 0.1%
176.8082888 1
 
< 0.1%
Other values (16990) 16990
99.9%
ValueCountFrequency (%)
40.38248531 1
< 0.1%
40.43995901 1
< 0.1%
40.59643636 1
< 0.1%
40.78700734 1
< 0.1%
41.44855582 1
< 0.1%
41.68929533 1
< 0.1%
41.76914579 1
< 0.1%
41.92490108 1
< 0.1%
41.99350635 1
< 0.1%
42.04021669 1
< 0.1%
ValueCountFrequency (%)
310.9134025 1
< 0.1%
310.0045308 1
< 0.1%
309.024463 1
< 0.1%
308.2586187 1
< 0.1%
308.2087899 1
< 0.1%
307.265658 1
< 0.1%
307.0978 1
< 0.1%
307.0817122 1
< 0.1%
306.8526662 1
< 0.1%
306.8268244 1
< 0.1%

Nitrogen
Real number (ℝ)

HIGH CORRELATION 

Distinct: 16999
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 115.76199
Minimum: 19.812274
Maximum: 397.72802
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 132.9 KiB

Quantile statistics

Minimum: 19.812274
5-th percentile: 33.493843
Q1: 54.114753
median: 100.83467
Q3: 148.0795
95-th percentile: 332.74341
Maximum: 397.72802
Range: 377.91575
Interquartile range (IQR): 93.964747

Descriptive statistics

Standard deviation: 81.365462
Coefficient of variation (CV): 0.70286853
Kurtosis: 1.4182253
Mean: 115.76199
Median Absolute Deviation (MAD): 46.97033
Skewness: 1.3343998
Sum: 1967953.9
Variance: 6620.3384
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
118.6685391 2
 
< 0.1%
138.8255457 1
 
< 0.1%
29.30561793 1
 
< 0.1%
33.35866387 1
 
< 0.1%
29.19656812 1
 
< 0.1%
30.01070523 1
 
< 0.1%
33.26690379 1
 
< 0.1%
35.63336167 1
 
< 0.1%
29.68300563 1
 
< 0.1%
25.71834393 1
 
< 0.1%
Other values (16989) 16989
99.9%
ValueCountFrequency (%)
19.81227397 1
< 0.1%
20.64483672 1
< 0.1%
20.65522439 1
< 0.1%
20.87661119 1
< 0.1%
20.98268205 1
< 0.1%
21.04814379 1
< 0.1%
21.05406118 1
< 0.1%
21.08699578 1
< 0.1%
21.14858245 1
< 0.1%
21.21340754 1
< 0.1%
ValueCountFrequency (%)
397.7280238 1
< 0.1%
390.3577685 1
< 0.1%
389.8645136 1
< 0.1%
388.9105955 1
< 0.1%
387.9390981 1
< 0.1%
384.5037659 1
< 0.1%
383.8978629 1
< 0.1%
383.3720095 1
< 0.1%
383.0744109 1
< 0.1%
383.0493873 1
< 0.1%

Phosphorus
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 17000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 142.89053
Minimum: 13.236823
Maximum: 477.00284
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 132.9 KiB

Quantile statistics

Minimum: 13.236823
5-th percentile: 45.062563
Q1: 86.313741
median: 134.48448
Q3: 198.65566
95-th percentile: 245.38173
Maximum: 477.00284
Range: 463.76601
Interquartile range (IQR): 112.34192

Descriptive statistics

Standard deviation: 71.153388
Coefficient of variation (CV): 0.49795733
Kurtosis: -0.0054391353
Mean: 142.89053
Median Absolute Deviation (MAD): 51.149177
Skewness: 0.56554316
Sum: 2429139
Variance: 5062.8046
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
165.4874125 1
 
< 0.1%
224.3804806 1
 
< 0.1%
228.6030701 1
 
< 0.1%
229.1924432 1
 
< 0.1%
229.9615521 1
 
< 0.1%
223.0155202 1
 
< 0.1%
229.3995022 1
 
< 0.1%
235.7265279 1
 
< 0.1%
225.032261 1
 
< 0.1%
216.6900574 1
 
< 0.1%
Other values (16990) 16990
99.9%
ValueCountFrequency (%)
13.23682297 1
< 0.1%
15.36988274 1
< 0.1%
15.58144049 1
< 0.1%
16.28096496 1
< 0.1%
16.40208262 1
< 0.1%
16.97727066 1
< 0.1%
17.29087416 1
< 0.1%
17.63065172 1
< 0.1%
17.81170945 1
< 0.1%
17.91985515 1
< 0.1%
ValueCountFrequency (%)
477.0028373 1
< 0.1%
435.4181763 1
< 0.1%
417.8115731 1
< 0.1%
417.586671 1
< 0.1%
409.4201346 1
< 0.1%
405.0722946 1
< 0.1%
404.4013295 1
< 0.1%
403.2087305 1
< 0.1%
402.4021175 1
< 0.1%
401.9227582 1
< 0.1%

Potassium
Real number (ℝ)

HIGH CORRELATION  UNIQUE 

Distinct: 17000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 73.926233
Minimum: 0.71008109
Maximum: 577.93062
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 132.9 KiB

Quantile statistics

Minimum: 0.71008109
5-th percentile: 2.1961293
Q1: 7.0601317
median: 17.98645
Q3: 125.78686
95-th percentile: 302.13776
Maximum: 577.93062
Range: 577.22054
Interquartile range (IQR): 118.72673

Descriptive statistics

Standard deviation: 112.12407
Coefficient of variation (CV): 1.5167021
Kurtosis: 5.2252838
Mean: 73.926233
Median Absolute Deviation (MAD): 13.201848
Skewness: 2.2140175
Sum: 1256746
Variance: 12571.807
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
339.7452031 1
 
< 0.1%
8.914419716 1
 
< 0.1%
7.175068504 1
 
< 0.1%
7.079378588 1
 
< 0.1%
6.628401898 1
 
< 0.1%
8.224105792 1
 
< 0.1%
7.177859189 1
 
< 0.1%
9.115489452 1
 
< 0.1%
8.593476643 1
 
< 0.1%
8.149366032 1
 
< 0.1%
Other values (16990) 16990
99.9%
ValueCountFrequency (%)
0.7100810882 1
< 0.1%
0.8303608028 1
< 0.1%
0.8575093563 1
< 0.1%
0.8698211768 1
< 0.1%
0.887160667 1
< 0.1%
0.8947396679 1
< 0.1%
0.9291715407 1
< 0.1%
0.929262724 1
< 0.1%
0.9318240701 1
< 0.1%
0.9384698351 1
< 0.1%
ValueCountFrequency (%)
577.9306215 1
< 0.1%
575.4901503 1
< 0.1%
571.8025448 1
< 0.1%
569.514464 1
< 0.1%
569.067595 1
< 0.1%
565.4592406 1
< 0.1%
562.9906286 1
< 0.1%
562.6452046 1
< 0.1%
561.6756677 1
< 0.1%
561.4696497 1
< 0.1%

Yield
Categorical

HIGH CARDINALITY  IMBALANCE 

Distinct: 5004
Distinct (%): 29.4%
Missing: 0
Missing (%): 0.0%
Memory size: 132.9 KiB

low_acidic: 5810
neutral: 3557
acidic: 1500
low_alkaline: 1133
28.98586919: 1
Other values (4999): 4999

Length

Max length: 12
Median length: 11
Mean length: 9.4129412
Min length: 6

Characters and Unicode

Total characters: 160020
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5000
Unique (%): 29.4%

Sample

1st row: 63.20371897
2nd row: 61.7722084
3rd row: 62.87869779
4th row: 60.51065445
5th row: 62.05118102

Common Values

Value                  Count   Frequency (%)
low_acidic              5810   34.2%
neutral                 3557   20.9%
acidic                  1500   8.8%
low_alkaline            1133   6.7%
28.98586919                1   < 0.1%
29.64918759                1   < 0.1%
29.52015332                1   < 0.1%
29.45671775                1   < 0.1%
29.23637793                1   < 0.1%
28.98149912                1   < 0.1%
Other values (4994)     4994   29.4%
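
Yield mixes numeric strings such as 28.98586919 with text labels such as low_acidic, which is why it is profiled as a high-cardinality categorical rather than a number. A minimal way to separate the two kinds of value is sketched below; `split_mixed` is a hypothetical helper, and what the text labels mean in a yield column is not something the report explains.

```python
def split_mixed(values):
    """Partition strings into parsed floats and leftover text labels."""
    numbers, labels = [], []
    for v in values:
        try:
            numbers.append(float(v))
        except ValueError:
            labels.append(v)
    return numbers, labels


# A few Yield values taken from the Sample and Common Values listings.
sample = ["63.20371897", "low_acidic", "neutral", "28.98586919", "acidic", "low_alkaline"]
numbers, labels = split_mixed(sample)
print(numbers)
print(labels)
```

Counting how many rows land in each list gives a quick measure of how much of the column is usable as a numeric target.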

Length

Histogram of lengths of the category
ValueCountFrequency (%)
low_acidic 5810
34.2%
neutral 3557
20.9%
acidic 1500
 
8.8%
low_alkaline 1133
 
6.7%
31.82367954 1
 
< 0.1%
62.11884577 1
 
< 0.1%
62.87869779 1
 
< 0.1%
60.51065445 1
 
< 0.1%
62.05118102 1
 
< 0.1%
60.3529584 1
 
< 0.1%
Other values (4994) 4994
29.4%

Most occurring characters

ValueCountFrequency (%)
i 15753
 
9.8%
c 14620
 
9.1%
a 13133
 
8.2%
l 12766
 
8.0%
d 7310
 
4.6%
o 6943
 
4.3%
w 6943
 
4.3%
_ 6943
 
4.3%
2 6553
 
4.1%
3 5494
 
3.4%
Other values (15) 63562
39.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 98652
61.6%
Decimal Number 49425
30.9%
Connector Punctuation 6943
 
4.3%
Other Punctuation 5000
 
3.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 15753
16.0%
c 14620
14.8%
a 13133
13.3%
l 12766
12.9%
d 7310
7.4%
o 6943
7.0%
w 6943
7.0%
e 4690
 
4.8%
n 4690
 
4.8%
u 3557
 
3.6%
Other values (3) 8247
8.4%
Decimal Number
ValueCountFrequency (%)
2 6553
13.3%
3 5494
11.1%
6 5256
10.6%
1 5093
10.3%
5 4884
9.9%
9 4853
9.8%
4 4787
9.7%
8 4583
9.3%
7 4361
8.8%
0 3561
7.2%
Connector Punctuation
ValueCountFrequency (%)
_ 6943
100.0%
Other Punctuation
ValueCountFrequency (%)
. 5000
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 98652
61.6%
Common 61368
38.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 15753
16.0%
c 14620
14.8%
a 13133
13.3%
l 12766
12.9%
d 7310
7.4%
o 6943
7.0%
w 6943
7.0%
e 4690
 
4.8%
n 4690
 
4.8%
u 3557
 
3.6%
Other values (3) 8247
8.4%
Common
ValueCountFrequency (%)
_ 6943
11.3%
2 6553
10.7%
3 5494
9.0%
6 5256
8.6%
1 5093
8.3%
. 5000
8.1%
5 4884
8.0%
9 4853
7.9%
4 4787
7.8%
8 4583
7.5%
Other values (2) 7922
12.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 160020
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 15753
 
9.8%
c 14620
 
9.1%
a 13133
 
8.2%
l 12766
 
8.0%
d 7310
 
4.6%
o 6943
 
4.3%
w 6943
 
4.3%
_ 6943
 
4.3%
2 6553
 
4.1%
3 5494
 
3.4%
Other values (15) 63562
39.7%

Category_pH
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 132.9 KiB

Loamy: 8497
low_acidic: 3759
Sandy loam: 2003
Sandy Loam: 1264
neutral: 1241

Length

Max length: 10
Median length: 7
Mean length: 7.1985882
Min length: 4

Characters and Unicode

Total characters: 122376
Distinct characters: 18
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: low_acidic
2nd row: low_acidic
3rd row: low_acidic
4th row: low_acidic
5th row: low_acidic

Common Values

Value | Count | Frequency (%)
Loamy | 8497 | 50.0%
low_acidic | 3759 | 22.1%
Sandy loam | 2003 | 11.8%
Sandy Loam | 1264 | 7.4%
neutral | 1241 | 7.3%
Loam | 236 | 1.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
loamy | 8497 | 41.9%
low_acidic | 3759 | 18.5%
loam | 3503 | 17.3%
sandy | 3267 | 16.1%
neutral | 1241 | 6.1%
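
The lowercase token counts above look as if each value was lowercased and split on whitespace before counting. That is an inference from the numbers, not something the report states. A minimal sketch under that assumption, with a hypothetical helper `word_frequencies`:

```python
from collections import Counter

# Value counts copied from the Common Values table for Category_pH.
value_counts = {
    "Loamy": 8497,
    "low_acidic": 3759,
    "Sandy loam": 2003,
    "Sandy Loam": 1264,
    "neutral": 1241,
    "Loam": 236,
}


def word_frequencies(counts):
    """Lowercase each value, split it on whitespace, and tally the tokens."""
    words = Counter()
    for value, n_rows in counts.items():
        for word in value.lower().split():
            words[word] += n_rows
    return words


words = word_frequencies(value_counts)
total_words = sum(words.values())
for word, count in words.most_common():
    print(f"{word} | {count} | {100 * count / total_words:.1f}%")
```

Here "Sandy loam", "Sandy Loam" and "Loam" all contribute to the `loam` token, which is why its count exceeds that of any single category label containing it.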

Most occurring characters

Value | Count | Frequency (%)
a | 20267 | 16.6%
o | 15759 | 12.9%
m | 12000 | 9.8%
y | 11764 | 9.6%
L | 9997 | 8.2%
c | 7518 | 6.1%
i | 7518 | 6.1%
d | 7026 | 5.7%
l | 7003 | 5.7%
n | 4508 | 3.7%
Other values (8) | 19016 | 15.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 102086 | 83.4%
Uppercase Letter | 13264 | 10.8%
Connector Punctuation | 3759 | 3.1%
Space Separator | 3267 | 2.7%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
a | 20267 | 19.9%
o | 15759 | 15.4%
m | 12000 | 11.8%
y | 11764 | 11.5%
c | 7518 | 7.4%
i | 7518 | 7.4%
d | 7026 | 6.9%
l | 7003 | 6.9%
n | 4508 | 4.4%
w | 3759 | 3.7%
Other values (4) | 4964 | 4.9%

Uppercase Letter
Value | Count | Frequency (%)
L | 9997 | 75.4%
S | 3267 | 24.6%

Connector Punctuation
Value | Count | Frequency (%)
_ | 3759 | 100.0%

Space Separator
Value | Count | Frequency (%)
(space) | 3267 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 115350 | 94.3%
Common | 7026 | 5.7%

Most frequent character per script

Latin
Value | Count | Frequency (%)
a | 20267 | 17.6%
o | 15759 | 13.7%
m | 12000 | 10.4%
y | 11764 | 10.2%
L | 9997 | 8.7%
c | 7518 | 6.5%
i | 7518 | 6.5%
d | 7026 | 6.1%
l | 7003 | 6.1%
n | 4508 | 3.9%
Other values (6) | 11990 | 10.4%

Common
Value | Count | Frequency (%)
_ | 3759 | 53.5%
(space) | 3267 | 46.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 122376 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
a | 20267 | 16.6%
o | 15759 | 12.9%
m | 12000 | 9.8%
y | 11764 | 9.6%
L | 9997 | 8.2%
c | 7518 | 6.1%
i | 7518 | 6.1%
d | 7026 | 5.7%
l | 7003 | 5.7%
n | 4508 | 3.7%
Other values (8) | 19016 | 15.5%

Season
Categorical

HIGH CORRELATION 

Distinct | 4
Distinct (%) | < 0.1%
Missing | 0
Missing (%) | 0.0%
Memory size | 132.9 KiB

Fall | 5604
Spring | 5531
Summer | 4864
Winter | 1001

Length

Max length | 6
Median length | 6
Mean length | 5.3407059
Min length | 4
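
The mean length can be cross-checked from the value counts reported for Season: it is the total number of characters divided by the number of rows. A small sketch using only the figures in this section:

```python
# Value counts copied from the Season figures in this report.
value_counts = {"Fall": 5604, "Spring": 5531, "Summer": 4864, "Winter": 1001}

n_rows = sum(value_counts.values())
# Each value contributes len(value) characters once per row it appears in.
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n_rows

min_length = min(len(value) for value in value_counts)
max_length = max(len(value) for value in value_counts)

print(n_rows, total_chars, round(mean_length, 7), min_length, max_length)
```

The row total, the character total and the rounded mean should line up with the observation count, total characters and mean length shown for this variable.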

Characters and Unicode

Total characters | 90792
Distinct characters | 14
Distinct categories | 2
Distinct scripts | 1
Distinct blocks | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique | 0
Unique (%) | 0.0%

Sample

1st row | Fall
2nd row | Spring
3rd row | Spring
4th row | Spring
5th row | Fall

Common Values

Value | Count | Frequency (%)
Fall | 5604 | 33.0%
Spring | 5531 | 32.5%
Summer | 4864 | 28.6%
Winter | 1001 | 5.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
fall | 5604 | 33.0%
spring | 5531 | 32.5%
summer | 4864 | 28.6%
winter | 1001 | 5.9%

Most occurring characters

Value | Count | Frequency (%)
r | 11396 | 12.6%
l | 11208 | 12.3%
S | 10395 | 11.4%
m | 9728 | 10.7%
i | 6532 | 7.2%
n | 6532 | 7.2%
e | 5865 | 6.5%
F | 5604 | 6.2%
a | 5604 | 6.2%
p | 5531 | 6.1%
Other values (4) | 12397 | 13.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 73792 | 81.3%
Uppercase Letter | 17000 | 18.7%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
r | 11396 | 15.4%
l | 11208 | 15.2%
m | 9728 | 13.2%
i | 6532 | 8.9%
n | 6532 | 8.9%
e | 5865 | 7.9%
a | 5604 | 7.6%
p | 5531 | 7.5%
g | 5531 | 7.5%
u | 4864 | 6.6%

Uppercase Letter
Value | Count | Frequency (%)
S | 10395 | 61.1%
F | 5604 | 33.0%
W | 1001 | 5.9%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 90792 | 100.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
r | 11396 | 12.6%
l | 11208 | 12.3%
S | 10395 | 11.4%
m | 9728 | 10.7%
i | 6532 | 7.2%
n | 6532 | 7.2%
e | 5865 | 6.5%
F | 5604 | 6.2%
a | 5604 | 6.2%
p | 5531 | 6.1%
Other values (4) | 12397 | 13.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 90792 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
r | 11396 | 12.6%
l | 11208 | 12.3%
S | 10395 | 11.4%
m | 9728 | 10.7%
i | 6532 | 7.2%
n | 6532 | 7.2%
e | 5865 | 6.5%
F | 5604 | 6.2%
a | 5604 | 6.2%
p | 5531 | 6.1%
Other values (4) | 12397 | 13.7%

Name
Categorical

HIGH CORRELATION  UNIFORM 

Distinct | 34
Distinct (%) | 0.2%
Missing | 0
Missing (%) | 0.0%
Memory size | 132.9 KiB

Tomatoes | 500
Figs | 500
Apple | 500
Orange | 500
Pomegranate | 500
Other values (29) | 14500

Length

Max length | 14
Median length | 10
Mean length | 7.2352941
Min length | 4

Characters and Unicode

Total characters | 123000
Distinct characters | 36
Distinct categories | 3
Distinct scripts | 2
Distinct blocks | 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique | 0
Unique (%) | 0.0%

Sample

1st row | Tomatoes
2nd row | Tomatoes
3rd row | Tomatoes
4th row | Tomatoes
5th row | Tomatoes

Common Values

Value | Count | Frequency (%)
Tomatoes | 500 | 2.9%
Figs | 500 | 2.9%
Apple | 500 | 2.9%
Orange | 500 | 2.9%
Pomegranate | 500 | 2.9%
Peach | 500 | 2.9%
Blueberry | 500 | 2.9%
Strawberry | 500 | 2.9%
Cherries | 500 | 2.9%
Eggplants | 500 | 2.9%
Other values (24) | 12000 | 70.6%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
tomatoes | 500 | 2.8%
figs | 500 | 2.8%
endive | 500 | 2.8%
cress | 500 | 2.8%
chard | 500 | 2.8%
beet | 500 | 2.8%
arugula | 500 | 2.8%
green | 500 | 2.8%
peas | 500 | 2.8%
broccoli | 500 | 2.8%
Other values (26) | 13000 | 72.2%

Most occurring characters

Value | Count | Frequency (%)
e | 16000 | 13.0%
a | 11500 | 9.3%
r | 11000 | 8.9%
s | 7000 | 5.7%
i | 6500 | 5.3%
o | 6000 | 4.9%
l | 6000 | 4.9%
t | 5500 | 4.5%
p | 5500 | 4.5%
u | 4500 | 3.7%
Other values (26) | 43500 | 35.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 105500 | 85.8%
Uppercase Letter | 16500 | 13.4%
Space Separator | 1000 | 0.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 16000 | 15.2%
a | 11500 | 10.9%
r | 11000 | 10.4%
s | 7000 | 6.6%
i | 6500 | 6.2%
o | 6000 | 5.7%
l | 6000 | 5.7%
t | 5500 | 5.2%
p | 5500 | 5.2%
u | 4500 | 4.3%
Other values (11) | 26000 | 24.6%

Uppercase Letter
Value | Count | Frequency (%)
C | 3500 | 21.2%
P | 2500 | 15.2%
A | 1500 | 9.1%
B | 1500 | 9.1%
S | 1000 | 6.1%
E | 1000 | 6.1%
L | 1000 | 6.1%
K | 1000 | 6.1%
G | 1000 | 6.1%
R | 500 | 3.0%
Other values (4) | 2000 | 12.1%

Space Separator
Value | Count | Frequency (%)
(space) | 1000 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 122000 | 99.2%
Common | 1000 | 0.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 16000 | 13.1%
a | 11500 | 9.4%
r | 11000 | 9.0%
s | 7000 | 5.7%
i | 6500 | 5.3%
o | 6000 | 4.9%
l | 6000 | 4.9%
t | 5500 | 4.5%
p | 5500 | 4.5%
u | 4500 | 3.7%
Other values (25) | 42500 | 34.8%

Common
Value | Count | Frequency (%)
(space) | 1000 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 123000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 16000 | 13.0%
a | 11500 | 9.3%
r | 11000 | 8.9%
s | 7000 | 5.7%
i | 6500 | 5.3%
o | 6000 | 4.9%
l | 6000 | 4.9%
t | 5500 | 4.5%
p | 5500 | 4.5%
u | 4500 | 3.7%
Other values (26) | 43500 | 35.4%

Interactions

[Figure: interaction plots, 81 panels]

Correlations

Variable | Temperature | Rainfall | pH | Light_Hours | Light_Intensity | Rh | Nitrogen | Phosphorus | Potassium | Soil_Type | Fertility | Photoperiod | Category_pH | Season | Name
Temperature | 1.000 | -0.584 | 0.654 | 0.636 | -0.632 | 0.484 | -0.486 | 0.384 | -0.757 | 0.425 | 0.524 | 0.538 | 0.514 | 0.228 | 0.623
Rainfall | -0.584 | 1.000 | -0.634 | -0.653 | 0.510 | -0.501 | 0.438 | -0.382 | 0.626 | 0.578 | 0.464 | 0.353 | 0.468 | 0.171 | 0.671
pH | 0.654 | -0.634 | 1.000 | 0.754 | -0.639 | 0.514 | -0.394 | 0.548 | -0.605 | 0.370 | 0.562 | 0.470 | 0.431 | 0.191 | 0.600
Light_Hours | 0.636 | -0.653 | 0.754 | 1.000 | -0.646 | 0.517 | -0.346 | 0.515 | -0.609 | 0.469 | 0.534 | 0.597 | 0.515 | 0.206 | 0.751
Light_Intensity | -0.632 | 0.510 | -0.639 | -0.646 | 1.000 | -0.535 | 0.183 | -0.367 | 0.591 | 0.511 | 0.462 | 0.332 | 0.460 | 0.114 | 0.469
Rh | 0.484 | -0.501 | 0.514 | 0.517 | -0.535 | 1.000 | -0.099 | 0.385 | -0.524 | 0.447 | 0.462 | 0.610 | 0.427 | 0.231 | 0.821
Nitrogen | -0.486 | 0.438 | -0.394 | -0.346 | 0.183 | -0.099 | 1.000 | -0.040 | 0.410 | 0.417 | 0.442 | 0.472 | 0.338 | 0.234 | 0.747
Phosphorus | 0.384 | -0.382 | 0.548 | 0.515 | -0.367 | 0.385 | -0.040 | 1.000 | -0.208 | 0.409 | 0.374 | 0.466 | 0.353 | 0.207 | 0.688
Potassium | -0.757 | 0.626 | -0.605 | -0.609 | 0.591 | -0.524 | 0.410 | -0.208 | 1.000 | 0.632 | 0.520 | 0.335 | 0.472 | 0.133 | 0.709
Soil_Type | 0.425 | 0.578 | 0.370 | 0.469 | 0.511 | 0.447 | 0.417 | 0.409 | 0.632 | 1.000 | 0.658 | 0.504 | 0.533 | 0.239 | 0.999
Fertility | 0.524 | 0.464 | 0.562 | 0.534 | 0.462 | 0.462 | 0.442 | 0.374 | 0.520 | 0.658 | 1.000 | 0.754 | 0.524 | 0.225 | 0.999
Photoperiod | 0.538 | 0.353 | 0.470 | 0.597 | 0.332 | 0.610 | 0.472 | 0.466 | 0.335 | 0.504 | 0.754 | 1.000 | 0.604 | 0.333 | 0.999
Category_pH | 0.514 | 0.468 | 0.431 | 0.515 | 0.460 | 0.427 | 0.338 | 0.353 | 0.472 | 0.533 | 0.524 | 0.604 | 1.000 | 0.230 | 0.842
Season | 0.228 | 0.171 | 0.191 | 0.206 | 0.114 | 0.231 | 0.234 | 0.207 | 0.133 | 0.239 | 0.225 | 0.333 | 0.230 | 1.000 | 0.605
Name | 0.623 | 0.671 | 0.600 | 0.751 | 0.469 | 0.821 | 0.747 | 0.688 | 0.709 | 0.999 | 0.999 | 0.999 | 0.842 | 0.605 | 1.000
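
The "High correlation" alerts in the overview appear consistent with flagging every pair whose absolute coefficient is at least 0.5. That cutoff is an assumption inferred from the numbers, not a setting documented in this report. The sketch below applies it to three rows copied from the matrix; `strong_partners` is a hypothetical helper.

```python
COLUMNS = [
    "Temperature", "Rainfall", "pH", "Light_Hours", "Light_Intensity",
    "Rh", "Nitrogen", "Phosphorus", "Potassium", "Soil_Type",
    "Fertility", "Photoperiod", "Category_pH", "Season", "Name",
]

# Three rows copied from the correlation matrix in this report.
ROWS = {
    "Temperature": [1.000, -0.584, 0.654, 0.636, -0.632, 0.484, -0.486, 0.384,
                    -0.757, 0.425, 0.524, 0.538, 0.514, 0.228, 0.623],
    "Phosphorus": [0.384, -0.382, 0.548, 0.515, -0.367, 0.385, -0.040, 1.000,
                   -0.208, 0.409, 0.374, 0.466, 0.353, 0.207, 0.688],
    "Season": [0.228, 0.171, 0.191, 0.206, 0.114, 0.231, 0.234, 0.207,
               0.133, 0.239, 0.225, 0.333, 0.230, 1.000, 0.605],
}

THRESHOLD = 0.5  # assumed cutoff, inferred from the alert counts


def strong_partners(name, threshold=THRESHOLD):
    """Return the other variables whose |coefficient| with `name` meets the cutoff."""
    return [
        column
        for column, value in zip(COLUMNS, ROWS[name])
        if column != name and abs(value) >= threshold
    ]


for variable in ROWS:
    partners = strong_partners(variable)
    print(variable, len(partners), partners)
```

With this cutoff, Season is linked only to Name, Phosphorus to pH plus two other fields, and Temperature to Rainfall plus eight other fields, matching the wording of the corresponding alerts.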

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

Index | Soil_Type | Fertility | Photoperiod | N-P-K Ratio | Temperature | Rainfall | pH | Light_Hours | Light_Intensity | Rh | Nitrogen | Phosphorus | Potassium | Yield | Category_pH | Season | Name
0 | Sandy Loam | High | Short Day Period | 10:10:10 | 21.040031 | 487.325073 | 6.522558 | 6.957985 | 632.013531 | 54.621752 | 138.825546 | 165.487413 | 339.745203 | 63.20371897 | low_acidic | Fall | Tomatoes
1 | Sandy Loam | High | Short Day Period | 10:10:10 | 26.488870 | 504.266109 | 5.546069 | 6.837064 | 554.684016 | 51.988385 | 140.493337 | 163.022717 | 310.917259 | 61.7722084 | low_acidic | Spring | Tomatoes
2 | Sandy Loam | High | Short Day Period | 10:10:10 | 26.080255 | 517.932176 | 6.208011 | 6.524897 | 544.839341 | 57.867416 | 146.952909 | 169.138527 | 309.751116 | 62.87869779 | low_acidic | Spring | Tomatoes
3 | Sandy Loam | High | Short Day Period | 10:10:10 | 24.915214 | 481.271518 | 6.455199 | 6.580255 | 390.217747 | 54.299459 | 138.305047 | 153.326427 | 309.024518 | 60.51065445 | low_acidic | Spring | Tomatoes
4 | Sandy Loam | High | Short Day Period | 10:10:10 | 26.567407 | 486.119339 | 6.085407 | 6.344331 | 627.754487 | 57.001604 | 145.647511 | 159.974655 | 320.072464 | 62.05118102 | low_acidic | Fall | Tomatoes
5 | Sandy Loam | High | Short Day Period | 10:10:10 | 23.847680 | 513.749269 | 6.431117 | 7.115014 | 420.974368 | 54.246641 | 130.570236 | 161.564887 | 307.445134 | 60.3529584 | low_acidic | Spring | Tomatoes
6 | Sandy Loam | High | Short Day Period | 10:10:10 | 24.534518 | 498.365122 | 5.887184 | 6.798224 | 528.244555 | 56.928991 | 135.641485 | 151.867730 | 335.188192 | 62.96572242 | low_acidic | Spring | Tomatoes
7 | Sandy Loam | High | Short Day Period | 10:10:10 | 23.277702 | 488.305386 | 6.473678 | 6.463716 | 607.108526 | 55.382666 | 141.960565 | 160.293696 | 306.666227 | 60.84003294 | low_acidic | Fall | Tomatoes
8 | Sandy Loam | High | Short Day Period | 10:10:10 | 23.213803 | 452.920647 | 6.293185 | 6.789009 | 380.157444 | 55.657337 | 138.366051 | 161.055477 | 333.041864 | 59.55537707 | low_acidic | Fall | Tomatoes
9 | Sandy Loam | High | Short Day Period | 10:10:10 | 24.706619 | 481.364021 | 5.964399 | 7.492743 | 506.338432 | 55.820634 | 140.946405 | 161.635292 | 297.519013 | 61.05797026 | low_acidic | Summer | Tomatoes

Index | Soil_Type | Fertility | Photoperiod | N-P-K Ratio | Temperature | Rainfall | pH | Light_Hours | Light_Intensity | Rh | Nitrogen | Phosphorus | Potassium | Yield | Category_pH | Season | Name
16990 | moderate | Short Day Period, Day Neutral, Long Day Period | 10-10-2010 | 23:10:04 | 1005.723034 | 3.257533 | 13.628329 | 294.099342 | 92.928788 | 108.399910 | 236.417753 | 234.553554 | 6.105020 | acidic | Loamy | Summer | plum
16991 | moderate | Short Day Period, Day Neutral, Long Day Period | 10-10-2010 | 8:58:24 | 966.280532 | 3.481411 | 13.783428 | 287.383672 | 93.560600 | 108.848441 | 232.282393 | 248.861161 | 6.290038 | acidic | Loamy | Summer | plum
16992 | moderate | Short Day Period, Day Neutral, Long Day Period | 10-10-2010 | 11:44:27 | 936.988492 | 3.500432 | 12.609734 | 325.934702 | 92.904111 | 123.304621 | 238.110325 | 248.191461 | 5.316284 | acidic | Loamy | Summer | plum
16993 | moderate | Short Day Period, Day Neutral, Long Day Period | 10-10-2010 | 16:10:05 | 1011.180517 | 3.595008 | 14.085374 | 318.870348 | 93.439182 | 132.543185 | 236.344536 | 234.669221 | 6.481444 | acidic | Loamy | Summer | plum
16994 | moderate | Short Day Period, Day Neutral, Long Day Period | 10-10-2010 | 7:43:44 | 1053.824779 | 3.471088 | 11.374036 | 328.414219 | 93.326291 | 115.531954 | 240.840457 | 227.791026 | 6.694561 | acidic | Loamy | Summer | plum
16995 | moderate | Short Day Period, Day Neutral, Long Day Period | 10-10-2010 | 22:37:24 | 982.656812 | 3.649674 | 13.128135 | 314.765596 | 92.949162 | 117.886962 | 208.918392 | 228.175670 | 5.816694 | acidic | Loamy | Summer | plum
16996 | moderate | Short Day Period, Day Neutral, Long Day Period | 10-10-2010 | 5:31:12 | 1023.036444 | 3.087129 | 12.321330 | 353.865908 | 92.720948 | 121.648026 | 241.502352 | 251.770000 | 6.003023 | acidic | Loamy | Summer | plum
16997 | moderate | Short Day Period, Day Neutral, Long Day Period | 10-10-2010 | 18:58:21 | 1017.985019 | 3.567090 | 12.602164 | 329.165489 | 91.049162 | 120.002978 | 212.068106 | 256.089843 | 7.150471 | acidic | Loamy | Summer | plum
16998 | moderate | Short Day Period, Day Neutral, Long Day Period | 10-10-2010 | 9:21:42 | 1013.287563 | 3.585047 | 13.157985 | 328.068660 | 92.127418 | 120.869825 | 248.331740 | 233.641791 | 7.482319 | acidic | Loamy | Summer | plum
16999 | moderate | Short Day Period, Day Neutral, Long Day Period | 10-10-2010 | 18:19:20 | 1033.783149 | 3.757246 | 12.261236 | 356.462239 | 91.780295 | 132.107441 | 239.053768 | 236.762190 | 6.479093 | acidic | Loamy | Summer | plum